Airlines Review Analysis

This notebook is dedicated to perform an Exploratory Data Analysis on an airlines review dataset. The goal of this analysis is to uncover insights on customer satisfaction and factors that affect airline rating. By examining reviews from various perspectives, this analysis aims to provide actionable insights into what factors most influence passengers’ decisions to recommend an airline, as well as how verified and unverified reviews differ in their impact on overall ratings.


Importing necessary libraries

In [1]:
import pandas as pd
import re
from textblob import TextBlob
from wordcloud import WordCloud, STOPWORDS
from collections import Counter
import requests
import plotly.io as pio
import plotly.express as px
import plotly.graph_objects as go
import plotly.colors as colors
from plotly.subplots import make_subplots


Reading in data from the csv

In [2]:
dataset_path = 'airline reviews_countries updated.csv'
df = pd.read_csv(dataset_path)
df.head()
Out[2]:
Overall_Rating Unnamed Airline Name Review_Title Review Date Verified Review Aircraft Type Of Traveller Seat Type ... Country of Arrival Date Flown Seat Comfort Cabin Staff Service Food & Beverages Ground Service Inflight Entertainment Wifi & Connectivity Value For Money Recommended
0 9 0 AB Aviation "pretty decent airline" 11th November 2019 True Moroni to Moheli. Turned out to be a pretty de... NaN Solo Leisure Economy Class ... Comoro Islands November 2019 4.0 5.0 4.0 4.0 NaN NaN 3.0 yes
1 1 1 AB Aviation "Not a good airline" 25th June 2019 True Moroni to Anjouan. It is a very small airline.... E120 Solo Leisure Economy Class ... Comoro Islands June 2019 2.0 2.0 1.0 1.0 NaN NaN 2.0 no
2 1 2 AB Aviation "flight was fortunately short" 25th June 2019 True Anjouan to Dzaoudzi. A very small airline and ... Embraer E120 Solo Leisure Economy Class ... Mayotte June 2019 2.0 1.0 1.0 1.0 NaN NaN 2.0 no
3 1 3 Adria Airways "I will never fly again with Adria" 28th September 2019 False Please do a favor yourself and do not fly with... NaN Solo Leisure Economy Class ... Kosovo September 2019 1.0 1.0 NaN 1.0 NaN NaN 1.0 no
4 1 4 Adria Airways "it ruined our last days of holidays" 24th September 2019 True Do not book a flight with this airline! My fri... NaN Couple Leisure Economy Class ... Netherlands September 2019 1.0 1.0 1.0 1.0 1.0 1.0 1.0 no

5 rows × 24 columns


Get a summary of the data

In [3]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 23171 entries, 0 to 23170
Data columns (total 24 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   Overall_Rating          23171 non-null  object 
 1   Unnamed                 23171 non-null  int64  
 2   Airline Name            23171 non-null  object 
 3   Review_Title            23171 non-null  object 
 4   Review Date             23171 non-null  object 
 5   Verified                23171 non-null  bool   
 6   Review                  23171 non-null  object 
 7   Aircraft                7108 non-null   object 
 8   Type Of Traveller       19433 non-null  object 
 9   Seat Type               22075 non-null  object 
 10  Route                   19343 non-null  object 
 11  Departure               23171 non-null  object 
 12  Arrival                 23171 non-null  object 
 13  Country of Departure    23171 non-null  object 
 14  Country of Arrival      23171 non-null  object 
 15  Date Flown              19417 non-null  object 
 16  Seat Comfort            19016 non-null  float64
 17  Cabin Staff Service     18911 non-null  float64
 18  Food & Beverages        14500 non-null  float64
 19  Ground Service          18378 non-null  float64
 20  Inflight Entertainment  10829 non-null  float64
 21  Wifi & Connectivity     5920 non-null   float64
 22  Value For Money         22105 non-null  float64
 23  Recommended             23171 non-null  object 
dtypes: bool(1), float64(7), int64(1), object(15)
memory usage: 4.1+ MB
In [4]:
df.shape
Out[4]:
(23171, 24)


Data cleaning

In [5]:
# checking for duplicates
duplicates = df.duplicated()
print(f"Number of duplicate rows: {duplicates.sum()}")

df[duplicates]
Number of duplicate rows: 0
Out[5]:
Overall_Rating Unnamed Airline Name Review_Title Review Date Verified Review Aircraft Type Of Traveller Seat Type ... Country of Arrival Date Flown Seat Comfort Cabin Staff Service Food & Beverages Ground Service Inflight Entertainment Wifi & Connectivity Value For Money Recommended

0 rows × 24 columns

In [6]:
# dropping columns that will not be utilised
df.drop(['Overall_Rating', 'Departure', 'Arrival', 'Route'], axis=1, inplace=True)
In [7]:
# checking for null values
df.isnull().sum()
Out[7]:
Unnamed                       0
Airline Name                  0
Review_Title                  0
Review Date                   0
Verified                      0
Review                        0
Aircraft                  16063
Type Of Traveller          3738
Seat Type                  1096
Country of Departure          0
Country of Arrival            0
Date Flown                 3754
Seat Comfort               4155
Cabin Staff Service        4260
Food & Beverages           8671
Ground Service             4793
Inflight Entertainment    12342
Wifi & Connectivity       17251
Value For Money            1066
Recommended                   0
dtype: int64
In [8]:
# filling in missing values for categorical columns that will be utilised in this analysis

mode_type_of_traveller = df['Type Of Traveller'].mode()[0]
df['Type Of Traveller'].fillna(mode_type_of_traveller, inplace=True)

mode_seat_type = df['Seat Type'].mode()[0]
df['Seat Type'].fillna(mode_seat_type, inplace=True)

numerical_columns = df.select_dtypes(include=['float64']).columns

# filling in missing values for numerical columns that will be utilised in this analysis

for column in numerical_columns:
    group_means = df.groupby('Airline Name')[column].transform('mean')
    group_means.fillna(df[column].mean(), inplace=True)
    df[column] = df[column].fillna(group_means).round(0).astype(float)
    
# filling in missing values for Date Flown using backward fill
df['Date Flown'] = df.groupby('Airline Name')['Date Flown'].fillna(method='ffill')
df['Date Flown'] = df.groupby('Airline Name')['Date Flown'].fillna(method='bfill')

df.isnull().sum()
Out[8]:
Unnamed                       0
Airline Name                  0
Review_Title                  0
Review Date                   0
Verified                      0
Review                        0
Aircraft                  16063
Type Of Traveller             0
Seat Type                     0
Country of Departure          0
Country of Arrival            0
Date Flown                  310
Seat Comfort                  0
Cabin Staff Service           0
Food & Beverages              0
Ground Service                0
Inflight Entertainment        0
Wifi & Connectivity           0
Value For Money               0
Recommended                   0
dtype: int64
In [9]:
df['Route'] = df['Country of Departure'] + ' to ' + df['Country of Arrival']
In [10]:
# re-adding the previously dropped ‘Overall Ratings’ column, this time by calculating it as the sum of the Service Category ratings

df['Overall Rating'] = df[['Seat Comfort', 'Cabin Staff Service', 'Food & Beverages', 'Ground Service', 'Inflight Entertainment', 'Wifi & Connectivity', 'Value For Money']].sum(axis=1)
df.head(2)
Out[10]:
Unnamed Airline Name Review_Title Review Date Verified Review Aircraft Type Of Traveller Seat Type Country of Departure ... Seat Comfort Cabin Staff Service Food & Beverages Ground Service Inflight Entertainment Wifi & Connectivity Value For Money Recommended Route Overall Rating
0 0 AB Aviation "pretty decent airline" 11th November 2019 True Moroni to Moheli. Turned out to be a pretty de... NaN Solo Leisure Economy Class Comoro Islands ... 4.0 5.0 4.0 4.0 2.0 2.0 3.0 yes Comoro Islands to Comoro Islands 24.0
1 1 AB Aviation "Not a good airline" 25th June 2019 True Moroni to Anjouan. It is a very small airline.... E120 Solo Leisure Economy Class Comoro Islands ... 2.0 2.0 1.0 1.0 2.0 2.0 2.0 no Comoro Islands to Comoro Islands 12.0

2 rows × 22 columns


Which airlines receive the highest and lowest ratings, in terms of average overall ratings?

In [11]:
# calculate the mean overall rating, grouped by airline name 
airline_ratings = df.groupby('Airline Name')['Overall Rating'].mean().reset_index().round(1)

airline_ratings = airline_ratings.sort_values(by='Overall Rating', ascending=False)

top_5_airlines = airline_ratings.head(5).copy()
bottom_5_airlines = airline_ratings.tail(5).copy()

# new columns for top and bottom airlines
top_5_airlines['Type'] = 'Top 5 Airlines'
bottom_5_airlines['Type'] = 'Bottom 5 Airlines'

combined_airlines = pd.concat([top_5_airlines, bottom_5_airlines])

fig = go.Figure()

# bars for the top 5 airlines
fig.add_trace(
    go.Bar(
        x=top_5_airlines['Airline Name'],
        y=top_5_airlines['Overall Rating'],
        marker_color='skyblue',
        name="Top 5 Airlines"
    )
)

# bars for bottom 5 airlines
fig.add_trace(
    go.Bar(
        x=bottom_5_airlines['Airline Name'],
        y=bottom_5_airlines['Overall Rating'],
        marker_color='#B0C4DE',
        name="Bottom 5 Airlines"
    )
)

fig.update_layout(
    height=700,
    title_text="Top and Bottom 5 Airlines by Average Overall Rating",
    xaxis_title="Airline Name",
    yaxis_title="Average Overall Rating",
    showlegend=True,
    plot_bgcolor='rgba(0,0,0,0)'
)

fig.show()

The above analysis shows a clear distinction between the top 5 and bottom 5 airlines based on average of the total numerical ratings. The top 5 airlines exhibit significantly higher average rating, with ratings in the range of 29 to 30, compared to the bottom 5 with ratings from 5 to 8.


What are the common review and sentiments in passenger reviews for the top and bottom airline?

The following uses wordcloud text analysis to determine the frequent words used in the review title for the top and bottom airline. This will give us a clearer picture of what’s driving those high or low ratings.

In [12]:
# filter for StarFlyer and Arkefly reviews
StarFlyer_df = df[df['Airline Name'] == 'StarFlyer']
Arkefly_df = df[df['Airline Name'] == 'Arkefly']

# define custom stopwords
custom_stopwords = {'i','the','only','t','had','would', 'and', 'is', 'to', 'in', 'it', 'a', 'for', 'of', 'on', 'this', 
                    'that', 'with', 'as', 'at', 'by', 'are', 'was', 'an', 'be', 
                    'have', 'from', 'but', 'not', 'or', 'all', 'can', 'you', 
                    'your', 'my', 'if', 'I', 'me', 'he', 'she', 'we', 'they', 
                    'what', 'there', 'their', 'this', 'its', 'so', 'like', 
                    'no', 'yes', 'do', 'will', 'more', 'than', 'up', 'when', 'flight','starflyer','arkefly','very','were'}

# function to extract most common words
def get_common_words(df, num_words=10):
    # combine all reviews into a single string
    text = ' '.join(df['Review'])
    
    # remove punctuation and split into words
    words = re.findall(r'\w+', text.lower()) 
    filtered_words = [word for word in words if word not in custom_stopwords]  # exclude stopwords
    common_words = Counter(filtered_words).most_common(num_words)  # get the most common words
    return pd.DataFrame(common_words, columns=['Word', 'Frequency'])

# get common words for both airliens
common_words_starflyer = get_common_words(StarFlyer_df)
common_words_arkefly = get_common_words(Arkefly_df)

# plot common words for StarFlyer
fig1 = px.bar(common_words_starflyer, x='Word', y='Frequency',
               title='Most Common Words in StarFlyer Reviews',
               labels={'Word': 'Common Words', 'Frequency': 'Frequency'},
               color='Frequency', 
               color_continuous_scale=px.colors.sequential.Blues,
               template='plotly_white')

fig1.show()

# plot common words for Arkefly
fig2 = px.bar(common_words_arkefly, x='Word', y='Frequency',
               title='Most Common Words in Arkefly Reviews',
               labels={'Word': 'Common Words', 'Frequency': 'Frequency'},
               color='Frequency',
               color_continuous_scale=px.colors.sequential.Reds,
               template='plotly_white')

fig2.show()


Based on the StarFlyer results, common words like “service,” “tokyo,” “staff,” “leather,” “attendants,” “personal,” and “seat” give us a clear idea of why the airline has higher reviews. It shows that passengers are happy with the good service from the staff, the comfort of the leather seats, and that they appreciate the routes to major destinations like Tokyo.

For Arkefly, words like “seats,” “class,” “comfort,” “food,” “amsterdam,” “service,” and “premium” stand out. Since Arkefly has poor ratings, we can assume that passengers are not satisfied with the comfort, food quality, and overall service, especially when it comes to premium offerings. This suggests that Arkefly is not meeting expectations in these key areas.


How do different seat types fare in terms of ratings?

In [13]:
# group the dataset by Seat Type and calculate the mean for each service category rating
avg_ratings_seattype = df.groupby('Seat Type').agg(
    Seat_Comfort=('Seat Comfort', 'mean'),
    Cabin_Staff_Service=('Cabin Staff Service', 'mean'),
    Food_Beverages=('Food & Beverages', 'mean'),
    Ground_Service=('Ground Service', 'mean'),
    Inflight_Entertainment=('Inflight Entertainment', 'mean'),
    Wifi_Connectivity=('Wifi & Connectivity', 'mean'),
    Value_For_Money=('Value For Money', 'mean'),
    Review_Count=('Overall Rating', 'count')  # count number of reviews for each seat type
).reset_index().round(1)

# rename the columns to remove underscores
avg_ratings_seattype.rename(columns={
    'Seat_Comfort': 'Seat Comfort',
    'Cabin_Staff_Service': 'Cabin Staff Service',
    'Food_Beverages': 'Food Beverages',
    'Ground_Service': 'Ground Service',
    'Inflight_Entertainment': 'Inflight Entertainment',
    'Wifi_Connectivity': 'Wifi Connectivity',
    'Value_For_Money': 'Value For Money'
}, inplace=True)

avg_ratings_seattype
Out[13]:
Seat Type Seat Comfort Cabin Staff Service Food Beverages Ground Service Inflight Entertainment Wifi Connectivity Value For Money Review_Count
0 Business Class 3.3 3.6 3.1 3.0 2.6 2.2 3.1 2098
1 Economy Class 2.5 2.8 2.4 2.3 1.8 1.6 2.4 20241
2 First Class 3.5 3.7 3.1 3.1 2.4 2.3 3.2 186
3 Premium Economy 2.7 2.9 2.5 2.5 2.3 2.0 2.5 646
In [14]:
color_palette = [
    '#54bebe', 
    '#76c8c8', 
    '#98d1d1', 
    '#badbdb',  
    '#e4bcad',  
    '#df979e',  
    '#d7658b'  
]

# melt df for stacking
avg_ratings_seattype_melted = avg_ratings_seattype.melt(
    id_vars=['Seat Type', 'Review_Count'], 
    value_vars=['Seat Comfort', 'Cabin Staff Service', 'Food Beverages', 
                'Ground Service', 'Inflight Entertainment', 
                'Wifi Connectivity', 'Value For Money'],
    var_name='Service_Category', 
    value_name='Rating'
)

# stacked bar chart
fig = px.bar(
    avg_ratings_seattype_melted, 
    x='Seat Type', 
    y='Rating', 
    color='Service_Category',  # stack by service category
    title='Average Ratings by Seat Type and Service Category',
    labels={'Rating': 'Average Rating', 'Seat Type': 'Seat Type', 
            'Service_Category': 'Service Category'},
    hover_data={'Review_Count': True},  # show the count of reviews in the tooltip
    color_discrete_sequence=color_palette,
    template='plotly_white' 
)

# update x-axis to reflect the sorted order
fig.update_layout(xaxis=dict(categoryorder='total descending', categoryarray=avg_ratings_seattype['Seat Type'].tolist()))

fig.show()

The chart shows that first class has the highest average rating, while economy class is rated the lowest. However, economy has the most reviews, likely because it’s the most common seat type. First class, with fewer reviews, caters to a smaller group of travellers.

First class also seems to score the highest for all service category ratings, except for inflight entertainment, which was surpassed by business class. Economy class appears to score the lowest for all service category ratings. The higher volume of reviews for economy class could indicate a greater diversity of experiences, possibly leading to lower average ratings across service categories.


What factors influence whether a passenger recommends an airline?

In [15]:
# ensure 'Recommended' column is a string and handle NaN values
df['Recommended'] = df['Recommended'].fillna('').astype(str).str.strip().str.lower()

# convert 'Recommended' to binary format
df['Recommended'] = df['Recommended'].map({'yes': 1, 'no': 0})

# selecting relevant features
features = ['Seat Comfort', 'Cabin Staff Service', 'Food & Beverages', 
            'Ground Service', 'Inflight Entertainment', 'Wifi & Connectivity', 
            'Value For Money']

# corr matrix
corr_matrix = df[features + ['Recommended']].corr()

# heatmap
fig = px.imshow(corr_matrix,
                color_continuous_scale='Greens',  # Choose a color scale
                title='Correlation Matrix',
                labels=dict(x='Features', y='Features', color='Correlation Coefficient'),
                x=corr_matrix.columns,
                y=corr_matrix.columns)

fig.show()

Based on the heatmap results, we can see that value for money and ground service are the top factors influencing whether a passenger will recommend an airline.


What is the % of verified vs unverified reviews, and how does the verification status affect the ratings?

In [16]:
# calculate overall percentage of verified reviews
total_reviews = len(df)
verified_reviews = df['Verified'].sum()
non_verified_reviews = total_reviews - verified_reviews

# calculate average overall rating for verified and non-verified reviews
average_rating_verified = df[df['Verified'] == True]['Overall Rating'].mean().round(1)
average_rating_non_verified = df[df['Verified'] == False]['Overall Rating'].mean().round(1)

# data for pie chart
labels = ['Verified', 'Non-Verified']
sizes = [verified_reviews, non_verified_reviews]
avg_ratings = [average_rating_verified, average_rating_non_verified]

# pie chart
pie_data = pd.DataFrame({
    'Label': labels,
    'Count': sizes,
    'Average Rating': avg_ratings
})

# subplots
fig = make_subplots(
    rows=1, cols=2, 
    subplot_titles=("Verified vs Non-Verified Reviews", "Distribution of Ratings by Verification Status"),
    specs=[[{'type': 'pie'}, {'type': 'box'}]]  # Specify types for each subplot
)

# add pie chart
fig.add_trace(
    go.Pie(
        labels=pie_data['Label'],
        values=pie_data['Count'],
        hole=0.3,  # Optional: to create a donut chart
        textinfo='label+percent',
        hovertemplate="<b>%{label}</b><br>Number of Reviews: %{value}<br>Average Rating: %{customdata[0]:.1f}",
        customdata=pie_data['Average Rating'],
        marker=dict(colors=['#A0C4E1', '#7ED8B4'])
    ),
    row=1, col=1
)

# add box plot
fig.add_trace(
    go.Box(
        y=df['Overall Rating'],
        x=df['Verified'],
        jitter=0.5,  # Random jitter to avoid overlap
        pointpos=-1.8,  # Position of points
        name='Overall Rating',
        marker=dict(color='#A0C4E1'),
        boxmean=True,  # Add mean line,
    ),
    row=1, col=2
)

fig.update_layout(
    plot_bgcolor='white',  # Background color of the plot area
    paper_bgcolor='white',  # Background color of the entire figure
    xaxis=dict(
        tickvals=[True, False], 
        ticktext=['Verified', 'Non-Verified'],
        gridcolor='lightgrey'  # Lighter grey for x-axis grid lines
    ),
    yaxis=dict(
        gridcolor='lightgrey'  # Lighter grey for y-axis grid lines
    )
)

fig.show()

From the pie chart, we can observe that verified reviews make up slightly more at 53.2%. However, despite having more verified reviews, the average rating for these is lower by 0.2 compared to non-verified reviews. The box plot shows that the overall distribution of ratings between the two groups is quite similar, though the verified reviews have a lower Q1 value by 1.

This suggests that while most reviews are verified, the verification process doesn’t necessarily translate into higher ratings, and the range of ratings is still quite comparable.


Export data to csv, to be used on Tableau

In [17]:
df.to_csv('revised_airline_review.csv', index=False)